Prediction of the axial compression capacity of stub CFST columns using machine learning techniques

Concrete-filled steel tubular (CFST) columns are widely used in structural engineering because of their exceptional load-bearing capacity and ductility. However, existing design codes often yield different design capacities for the same column properties, introducing uncertainty for designers. Moreover, conventional regression analysis fails to capture the intricate relationship between column properties and compressive strength. To address these issues, this study proposes the use of two machine learning (ML) models: Gaussian process regression (GPR) and symbolic regression (SR). These models accept a variety of input variables, encompassing geometric and material properties of stub CFST columns, to estimate their strength. An experimental database of 1316 specimens was compiled from various research papers, including circular, rectangular, and double-skin stub CFST columns. In addition, a dimensionless output variable, referred to as the strength index, is introduced to enhance model performance. To validate the efficiency of the introduced models, their predictions are compared with those from two established design codes and various ML algorithms, including support vector regression optimized with particle swarm optimization (PSVR), artificial neural networks, XGBoost (XGB), CatBoost (CATB), Random Forest, and LightGBM models. Based on the performance metrics, the CATB, GPR, PSVR, and XGB models emerge as the most accurate and reliable. In addition, simple and practical design equations for the different types of CFST columns are proposed based on the SR model. The developed ML models and proposed equations can predict the compressive strength of stub CFST columns with reliable and accurate results, making them valuable tools for structural engineering. Furthermore, the Shapley additive explanations (SHAP) technique is employed for feature analysis. The results of the feature analysis reveal that the section slenderness ratio and concrete strength parameters negatively impact the compressive strength index.

1. Many researchers directly used axial strength as the output parameter even when its statistical distribution is skewed and biased, without further manipulation or considering its impact on model performance.
2. Existing studies often utilize the entire global database consisting of short and long columns for training/testing ML models. However, the distinct failure mechanisms of long and stub CFST columns can affect the relationships between inputs and strengths. Ipek et al. 28 conducted a sensitivity analysis to evaluate the performance of the developed ML models using a global database. It was observed that the performance of these models deteriorates for length-to-depth ratios between 2 and 4 while consistently performing well for larger ratios. Additionally, as highlighted by Hou and Zhou 20, dividing the database into long-column and stub-column subsets significantly enhanced the accuracy of ML methods compared with using the global database. Therefore, this study focuses only on predicting the axial capacity of short columns.
3. While most studies focus on using ANN to predict the axial compression strength of CFST columns, other supervised ML algorithms, such as SVR, GPR, symbolic regression, and tree-based ML algorithms, are less commonly employed.
4. Although the ANN model can introduce design formulas, the resulting formulas include numerous weights, biases, and transfer functions, which are not suitable for engineering practice 16.
5. As reported in the literature in Table 1, the predicted formulas for designing CFST columns using GA and GEP are efficient and compatible with experimental results. However, a significant drawback is that many of the provided formulas are complicated, unit-dependent, and lack explanations. This paper introduces a novel model to derive simple, practical, unit-independent expressions for predicting the axial compression of CFST columns.
This research collects an extensive experimental database of 1316 specimens from diverse research papers, including circular, rectangular, and double-skin stub CFST columns under axial load without eccentricity. Eight data-driven models are developed, including Gaussian process regression (GPR), symbolic regression (SR), support vector regression optimized with particle swarm optimization (PSVR), artificial neural networks (ANN), XGBoost (XGB), CatBoost (CATB), Random Forest (RF), and LightGBM (LGBM) models. The axial loads reported from the experimental results are normalized to enhance the performance of the ML models. In addition, design formulas are proposed for each column type. The hyperparameter tuning of the introduced ML models is performed using the Bayesian Optimization (BO) technique.

Dataset description
In this section, a comprehensive experimental database containing 1316 column specimens is compiled from research papers focusing on axially loaded stub CFST columns without eccentricity. The loading and geometric configuration of the specimens are illustrated in Fig. 1. All collected tests were conducted on CFST short columns (with length-to-width ratios smaller than or equal to 4.0 7,8,18) under monotonic loading and without internal rebar reinforcement. Only samples loaded uniformly across the entire cross-section are considered in the dataset. The database includes the following: (1) Dataset 1 comprises 674 observations with five input parameters related to circular CFST (CCFST) columns; (2) Dataset 2 involves 396 observations with six input parameters relevant to rectangular CFST (RCFST) columns; and (3) Dataset 3 contains 246 observations and involves seven input parameters associated with double-skin CFST (CFDST) columns.
Table 2 summarizes the details of the collected specimens, including the outer steel tube diameter (D in mm) for circular CFST and CFDST columns, the outside diameter of the inner steel tube (D_i in mm) for CFDST columns, the outer steel tube width (B in mm) and outer steel tube depth (H in mm) for rectangular CFST columns, the thickness of the outer steel tube (t in mm), the thickness of the inner steel tube (t_i in mm), the compressive strength of the core concrete (f_c′ in MPa), the yield strength of the outer steel tube (f_y in MPa), the yield strength of the inner steel tube (f_yi in MPa), and the column length (L in mm). These parameters are assumed to directly influence the axial capacity (P_u) of the CFST columns across the 1316 observations. Naser et al. 25 suggested that the remaining material properties of concrete and steel, i.e., Young's modulus of steel (E_s) and concrete (E_c) and the ultimate strength of steel (f_u), have no significant influence on the training of data-driven models. Table 2 also illustrates the statistical distributions of the collected datasets.
Table 1. Summary of previous ML models in predicting the strength of axially loaded stub CFST columns. * The remaining parameters P_i have similar expressions to P_1. + The expression provided is for circular columns. Similar expressions are introduced for rectangular and circular columns using GA and GEP.
Generally, using approximately normally distributed data for machine learning algorithms results in more stable and reliable models. As shown in Fig. 2a, the axial capacity of CCFST columns is not normally distributed and exhibits extreme skewness, which deteriorates the performance of machine learning models. Therefore, the authors propose a dimensionless strength index, denoted by p_si, as the main output parameter. It is obtained by normalizing the axial load, i.e., dividing the column capacity by the sum of the individual strengths of its components (the steel tubes and core concrete), as defined in Eq. (1):

$$p_{si} = \frac{P_u}{N_{pl}} = \frac{P_u}{A_s f_y + A_c f_c'} \tag{1}$$

where A_s and A_c are the outer steel tube and concrete areas, respectively. Note that for CFDST columns, the contribution of the inner tube, A_si f_yi, is added to the nominal column capacity, N_pl, in the above equation, where A_si is the area of the inner steel tube. The strength index reflects the confinement efficiency of the CFST column, i.e., a relatively high value of the strength index (p_si > 1.0) indicates high confinement exerted by the outer tube. High confinement exerted by the outer steel tube enhances the actual triaxial strength of the inner concrete compared to its uniaxial strength f_c′. As depicted in Fig. 2b and Table 2, the statistical distribution of the strength index closely resembles a normal distribution. The proximity of the strength index values to 1.0, together with its physical and dimensionless nature, makes it easily predictable and interpretable.
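As a minimal sketch of the normalization described by the strength index, the function below evaluates p_si = P_u / N_pl for a circular column. The function name, the argument names, and the annular concrete area used for the double-skin case (a hollow inner tube) are illustrative assumptions rather than details taken from the study.

```python
import math

def strength_index(P_u, D, t, f_y, f_c, D_i=None, t_i=None, f_yi=None):
    """Strength index p_si = P_u / N_pl for a circular stub column.

    Units: lengths in mm, strengths in MPa, P_u in N.
    If D_i, t_i and f_yi are given, the column is treated as double-skin
    (CFDST) with a hollow inner tube (an assumption of this sketch).
    """
    inner_d = D - 2.0 * t                                # inner diameter of the outer tube
    A_s = math.pi / 4.0 * (D**2 - inner_d**2)            # outer steel tube area
    A_c = math.pi / 4.0 * inner_d**2                     # concrete core area
    N_pl = A_s * f_y
    if D_i is not None:
        A_si = math.pi / 4.0 * (D_i**2 - (D_i - 2.0 * t_i)**2)
        A_c -= math.pi / 4.0 * D_i**2                    # concrete fills the annulus only
        N_pl += A_si * f_yi                              # inner tube contribution
    N_pl += A_c * f_c
    return P_u / N_pl

# Hypothetical specimen: D = 200 mm, t = 5 mm, f_y = 350 MPa, f_c' = 40 MPa
p_si = strength_index(P_u=2.5e6, D=200.0, t=5.0, f_y=350.0, f_c=40.0)
```

A value above 1.0 for such a specimen would indicate, in the sense used above, that the measured capacity exceeds the simple sum of the component strengths.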
The most critical parameter that controls stub column stability is the local slenderness coefficient, λ, defined in Eq. (2) for circular and rectangular tubes 29. As shown in Table 2, the database covers a wide range of steel section slenderness, including compact (λ ≤ 0.15 for circular tubes, λ ≤ 2.26 for rectangular tubes), noncompact (0.15 < λ ≤ 0.19 for circular tubes, 2.26 < λ ≤ 3.0 for rectangular tubes), and slender (λ > 0.19 for circular tubes, λ > 3.0 for rectangular tubes) columns, as classified by AISC 360-22 29. In addition, the database encompasses a wide range of concrete and steel strengths. As shown in Table 2, the database includes both traditional materials (with f_c′ values below 70 MPa and f_y values below 460 MPa, as suggested in AISC 360-22 29) and higher strength classes (with f_c′ up to 190 MPa and f_y up to 1153 MPa). It should be noted that most design codes of practice impose limits within their scope of application 29,30. These restrictions are related to the strengths of steel and concrete materials and the slenderness of steel sections.
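The classification limits quoted above can be sketched as a small lookup. Because Eq. (2) is not reproduced in the text, the normalized slenderness in `local_slenderness` is an assumption, chosen to be consistent with the quoted AISC 360 limits: (D/t)(f_y/E_s) for circular tubes and (H/t)·sqrt(f_y/E_s) for rectangular tubes. The default E_s = 200 000 MPa and all function names are likewise hypothetical.

```python
import math

# (compact limit, noncompact limit) as quoted in the text for AISC 360-22
LIMITS = {"circular": (0.15, 0.19), "rectangular": (2.26, 3.0)}

def local_slenderness(shape, width, t, f_y, E_s=200_000.0):
    """Assumed normalized local slenderness (Eq. (2) is not reproduced in the text)."""
    if shape == "circular":
        return (width / t) * (f_y / E_s)
    if shape == "rectangular":
        return (width / t) * math.sqrt(f_y / E_s)
    raise ValueError(f"unknown shape: {shape}")

def classify(shape, lam):
    """Return 'compact', 'noncompact' or 'slender' for a slenderness value."""
    compact_limit, noncompact_limit = LIMITS[shape]
    if lam <= compact_limit:
        return "compact"
    if lam <= noncompact_limit:
        return "noncompact"
    return "slender"

# Hypothetical circular tube: D = 200 mm, t = 5 mm, f_y = 350 MPa
lam_c = local_slenderness("circular", 200.0, 5.0, 350.0)
section_class = classify("circular", lam_c)
```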
Furthermore, Fig. 3 presents the correlation matrices of the input and output variables. As displayed in Fig. 3, the correlation between any pair of input variables is relatively weak (ρ < 0.5), except for the correlations between the outer dimensions of the tube and the column length. In addition, there is a strong relationship between the dimensions of the columns and their axial capacity, which may reduce the performance of the ML training process. However, the correlation between the dimensions and the strength index, p_si, is less significant: it is weakly positive for tube thickness and negative for the outer dimensions of the columns and section slenderness, as decreasing the outer dimensions-to-thickness ratio enhances the confinement behavior of stub columns. The yield strength of the outer steel tube has a negligible impact on the strength index. In contrast, concrete compressive strength is inversely correlated with the strength index for circular and CFDST columns and has a negligible effect on rectangular columns. The stronger correlation observed for circular sections is attributed to the more ductile behavior obtained with low-strength concrete. In contrast, the low correlation for rectangular sections is attributed to the generally lower confinement provided by rectangular steel tubes compared to circular sections.

Gaussian process
Gaussian process regression (GPR) 10 is an ML method based on Bayesian learning principles. GPR constructs a Gaussian distribution over functions, as defined in Eq. (3), and the observed data points inform this distribution. This technique can effectively handle uncertainty, adapt to noise and complexity levels, and prevent overfitting.

$$f(x) \sim \mathcal{GP}\left(m(x), K(x, x')\right) \tag{3}$$

where f(x) is the function distribution at input x, m(x) is the mean function, and K(x, x′) is the covariance (kernel) function determining the covariance between any inputs x and x′. A combination of kernels, including the Gaussian kernel, Matern kernel, and periodic kernel, is used to capture different aspects of the data, such as the overall level, smoothness, noise, and variations. The kernel parameters are optimized by maximizing the log-marginal likelihood 10. Given observed input-output pairs (X, y), GPR makes predictions for new inputs by inferring a posterior distribution over functions, p(f(x) | X, y), which is also a Gaussian distribution with a posterior mean function µ_p(x) and a posterior covariance function Σ_p(x). Here, µ_p(x) and Σ_p(x) define the mean prediction at the new input point x and the uncertainty (variance) associated with each prediction. The flow chart of the GPR model is illustrated in Fig. 4a.
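For reference, the posterior mean and covariance of a Gaussian process take the following standard textbook form. This is a sketch under the assumption of additive Gaussian observation noise with variance σ_n² (the noise model is not stated in the text), and the notation may differ from the article's own numbered equations.

```latex
\begin{align}
\mu_p(x) &= m(x) + K(x, X)\left[K(X, X) + \sigma_n^2 I\right]^{-1}\left(y - m(X)\right) \\
\Sigma_p(x) &= K(x, x) - K(x, X)\left[K(X, X) + \sigma_n^2 I\right]^{-1} K(X, x)
\end{align}
```

Here K(X, X) is the kernel matrix over the training inputs and K(x, X) is the vector of covariances between the new input and the training inputs.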
The GPR model can provide confidence intervals for its predictions, as illustrated in Fig. 5. This direct quantification of uncertainty enhances its applicability in guiding practical design considerations. The even distribution of the predicted column strength around the measured strength, as depicted in Fig. 5, further substantiates the accuracy of GPR in predicting stub CFST column strength.
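To illustrate how a GPR prediction carries its own confidence interval, the following is a minimal pure-Python sketch for scalar inputs. It assumes a zero mean function, a single Gaussian (RBF) kernel with fixed hyperparameters, and a small fixed noise term; the study instead combines several kernels and tunes them by maximizing the log-marginal likelihood, so this is not the authors' implementation.

```python
import math

def rbf(a, b, length=1.0, var=1.0):
    """Gaussian (squared-exponential) kernel for scalar inputs."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def gp_predict(X, y, x_new, noise=1e-6):
    """Zero-mean GP posterior mean and variance at a single new input."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    k_star = [rbf(x_new, a) for a in X]
    alpha = solve(K, y)            # [K + noise*I]^-1 y
    v = solve(K, k_star)           # [K + noise*I]^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)

def interval95(mean, var):
    """Approximate 95% confidence interval from the posterior variance."""
    half = 1.96 * math.sqrt(var)
    return mean - half, mean + half

# Hypothetical one-dimensional data (e.g. a scaled input vs. strength index)
X_train, y_train = [0.0, 1.0, 2.0], [1.0, 1.1, 0.9]
mean_near, var_near = gp_predict(X_train, y_train, 1.0)
mean_far, var_far = gp_predict(X_train, y_train, 10.0)
```

Near a training point the variance collapses toward the noise level, while far from the data it returns to the prior variance, which is the behavior that widens the confidence band away from observed specimens.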

Symbolic regression and proposed equations
Symbolic regression (SR) 31,32 is a supervised learning task and a genetic programming technique 12 that aims to discover simple and interpretable mathematical expressions that best fit a given dataset by exploring a predefined space of analytic expressions and mathematical functions. SR problems are solved as multi-objective optimization problems, balancing prediction accuracy and model complexity. SR algorithms often use techniques such as genetic programming to improve candidate mathematical expressions, applying the principles of natural selection and evolution to iteratively refine the expressions until satisfactory models are found. In this study, a recent Python library called PySR 33 is employed to derive mathematical expressions for the axial capacity of stub columns.
Figure 4. Flow charts of the introduced ML models.
The SR algorithm starts by building an initial population with random combinations of operational symbols or functions (e.g., +, −, /, *, ^) and terminals, such as input variables and constants, to generate a tree-like expression for each individual in the population. Individuals are selected probabilistically, giving the best individuals a higher chance of selection while still allowing the worst to be selected. If only the best expressions were selected, the algorithm would converge prematurely, making all the individuals in the population identical. Consequently, a large part of the search space would remain unexplored, and the search would be concentrated in a small region only. The selected individuals are mutated or crossed over to produce a new generation, using the fitness function to choose the best individuals in each generation. The mutation process consists of varying a node at random by replacing a function (Fig. 6a), a terminal (Fig. 6b), or an entire subtree with another random node or subtree, while the crossover operation swaps two randomly selected subtrees between a pair of individuals (Fig. 6c).
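The mutation and crossover operations described above can be sketched on a toy expression tree. This is an illustrative stand-alone example, not PySR's internal implementation: the nested-list tree encoding, the operator set, the terminal set, and all function names are assumptions, and the mutation shown covers only function and terminal replacement (not whole-subtree replacement).

```python
import copy
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", "y", 1.0, 2.0]

def evaluate(tree, env):
    """Evaluate a tree: either a terminal or a list [op, left, right]."""
    if isinstance(tree, list):
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    return env[tree] if isinstance(tree, str) else tree

def size(tree):
    """Total number of nodes in the expression tree."""
    if isinstance(tree, list):
        return 1 + size(tree[1]) + size(tree[2])
    return 1

def paths(tree, prefix=()):
    """Yield the index path to every node in the tree."""
    yield prefix
    if isinstance(tree, list):
        yield from paths(tree[1], prefix + (1,))
        yield from paths(tree[2], prefix + (2,))

def get(tree, path):
    """Return the node reached by following an index path."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return copy.deepcopy(new)
    tree = copy.deepcopy(tree)
    get(tree, path[:-1])[path[-1]] = copy.deepcopy(new)
    return tree

def mutate(tree, rng):
    """Replace the function symbol or the terminal at a random node."""
    path = rng.choice(list(paths(tree)))
    node = get(tree, path)
    if isinstance(node, list):
        new = [rng.choice(list(OPS)), node[1], node[2]]
    else:
        new = rng.choice(TERMINALS)
    return replace(tree, path, new)

def crossover(a, b, rng):
    """Swap one randomly chosen subtree between two parents."""
    pa = rng.choice(list(paths(a)))
    pb = rng.choice(list(paths(b)))
    return replace(a, pa, get(b, pb)), replace(b, pb, get(a, pa))

rng = random.Random(0)
parent_a = ["+", ["*", "x", 2.0], "y"]      # x * 2 + y
parent_b = ["-", "x", ["*", "y", "y"]]      # x - y * y
child_a, child_b = crossover(parent_a, parent_b, rng)
mutant = mutate(parent_a, rng)
```

Because crossover only exchanges subtrees, the combined node count of the two children equals that of the two parents, while this restricted mutation leaves the node count of an individual unchanged.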
In SR modeling, error minimization and simplicity are the key objectives of the fitness function. The fitness function, as defined in Ref. 33, combines the following terms: l_pred(E) is the prediction loss (chosen as the Mean Absolute Error), C(E) is the complexity of the expression E, defined as the total number of nodes in the expression, and frecency[C(E)] is a combined measure of the frequency and recency of expressions occurring at complexity C(E) in the population, which is used to avoid excessive growth and redundancies in the expressions produced by the SR model. Table 3 specifies the parameters of the SR model used in generating expressions. The main procedures of the SR are introduced in Fig. 4b.
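For orientation, the fitness function of PySR is, to the best of our reading of the PySR reference paper, of the following form; it is quoted here as an assumption because the article's own numbered equation is not reproduced in the text.

```latex
\ell(E) = \ell_{\mathrm{pred}}(E) \cdot \exp\!\left(\mathrm{frecency}\left[C(E)\right]\right)
```

Under this form, an expression whose complexity level is already crowded with frequent or recent candidates is penalized multiplicatively, which pushes the search toward less explored complexity levels.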
The process of selecting the optimal equation requires many iterations and a thorough exploration of each iteration. These iterations involve trying various custom functions, a wide range of operators, and exhaustive combinations of unitless input variables that potentially influence stub column strength, such as the confinement factor ξ, the local slenderness ratio λ, the global slenderness ratio, and cross-section dimension ratios (L/D, L/B, H/B, D_i/D). Unlike the approach commonly found in the literature 22, where unit-dependent inputs were used for axial strength prediction using Gene Expression Programming (GEP), these inputs are unitless, which enhances the robustness and interpretability of the predicted column behavior by avoiding potential issues related to unit dependencies. The equation derived from each iteration underwent comprehensive evaluation, simplification, and refinement to achieve a concise, understandable, and accurate function. The selection criteria carefully balance various aspects, including equation complexity, accuracy, interpretability, and the sensitivity of the output to variable changes. For circular CFST columns, the extracted equation is given in Eq. (8) in terms of the local and global slenderness ratios, where the definition of the global slenderness ratio involves E_s I_s and E_c I_c, the flexural stiffnesses of the steel and concrete parts.
Regarding the rectangular CFST columns, the proposed equation is given in Eq. (9). For double-skin circular CFST columns, the proposed equation is given in Eq. (10), where α = D_i/D and 0.2 ≤ α ≤ 0.85. The proposed equations establish a comprehensive and simple framework with meaningful physical interpretations for predicting the axial capacity of various CFST columns. In the circular column formula in Eq. (8), increasing the square of the global slenderness, the local slenderness λ, or the f_c′/f_y ratio reduces the axial capacity. In the rectangular section formula in Eq. (9), the axial strength of the composite column decreases with increasing local slenderness λ or H/B ratio. In the double-skin CFST column formula in Eq. (10), an increase in the confinement ratio reduces the column capacity. These observations align with the experimental behavior of CFST columns. In addition, the provided equations are simple, unit-independent, and physically meaningful compared to those of the previous studies in Table 1.

Data preprocessing and hyperparameter optimization technique
The min-max scaling technique is employed for data normalization to reduce the negative impact of multidimensionality. The grid search technique is utilized for tuning the models' hyperparameters during the training phase, and fivefold cross-validation is employed to mitigate overfitting. After normalization, the datasets were divided into distinct training and testing subsets. The objective of setting aside the testing subsets was to assess how well the trained models perform on new, unseen data. As widely reported in many studies 13,14,34, eighty percent of the original dataset is allocated randomly for training, leaving the remaining 20% for testing. To compare and evaluate the effectiveness and reliability of the introduced models, six different ML models, including support vector regression optimized with particle swarm optimization (PSVR) 35, Artificial Neural Network (ANN), XGBoost (XGB), CatBoost (CATB), Random Forest (RF), and LightGBM (LGBM) models, were introduced. All the introduced ML models were constructed and evaluated using the same training and testing subsets for a fair comparison.
The performance of most ML algorithms largely depends on their hyperparameters, which are set before model training. Properly tuning these hyperparameters is necessary to achieve the best prediction performance. Searching for the optimum hyperparameters involves trying out different values for each and selecting the combination that gives the best performance on the validation data. Traditional techniques, i.e., grid search (GS) and random search (RS), are time-consuming, especially for large search spaces with numerous hyperparameters. In contrast, Bayesian Optimization (BO) uses a surrogate function, i.e., a Gaussian process, random forest, or tree-structured Parzen estimator (TPE) 36, to guide the selection of the next hyperparameter value based on the results from previously tested hyperparameter values. This approach minimizes unnecessary evaluations, enabling BO to identify the optimal hyperparameter combination in fewer iterations than the GS and RS methods 37. In this study, we adopted the TPE model 36 to optimize the introduced ML models due to its robustness compared to other surrogate functions 37. The Mean Absolute Percentage Error (MAPE) on the validation dataset is chosen as the objective function. The expected improvement (EI) of TPE, defined in Eq. (11), builds a probability model of the objective function and uses it to select the most promising hyperparameters to evaluate in the true objective function 36. In Eq. (11), z is the hyperparameter combination chosen from the search space, and s* is a threshold chosen to be some quantile γ of the observed s values, so that p(s < s*) = γ. Additionally, l(z) and g(z) correspond to two distinct distributions: l(z) is formed from the observations whose objective function values are below the threshold, and g(z) from those whose values exceed it. To maximize EI, TPE focuses on drawing samples of hyperparameters with the maximum l(z)/g(z) ratios, following Eq. (11). Finally, cross-validation was applied to assess the introduced models' effectiveness, avoid overfitting, and obtain accurate predictions for the testing data.
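The l(z)/g(z) selection rule can be illustrated with a minimal one-dimensional sketch. In the original TPE formulation, as we recall it, EI is proportional to (γ + (1 − γ) g(z)/l(z))^(-1), so maximizing EI amounts to maximizing l(z)/g(z). The Gaussian kernel density estimate, the fixed bandwidth, the candidate grid, and the function names below are all assumptions for illustration; this is not the optimizer used in the study.

```python
import math

def kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate built from samples."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    def density(z):
        return sum(math.exp(-0.5 * ((z - s) / bandwidth) ** 2)
                   for s in samples) / norm
    return density

def tpe_suggest(history, candidates, gamma=0.25, bandwidth=0.1):
    """Pick the candidate z that maximizes l(z) / g(z).

    history: list of (z, s) pairs, where s is the loss to minimize
    (for example, the validation MAPE) observed at hyperparameter z.
    """
    ranked = sorted(history, key=lambda pair: pair[1])
    n_good = max(1, math.ceil(gamma * len(ranked)))
    good = [z for z, _ in ranked[:n_good]]   # losses below the threshold s*
    bad = [z for z, _ in ranked[n_good:]]    # losses above the threshold s*
    l, g = kde(good, bandwidth), kde(bad, bandwidth)
    return max(candidates, key=lambda z: l(z) / (g(z) + 1e-12))

# Hypothetical loss surface with its minimum at z = 0.3
grid = [i / 10 for i in range(11)]
history = [(z, (z - 0.3) ** 2) for z in grid]
best_z = tpe_suggest(history, grid)
```

In a real tuning loop the suggested value would be evaluated on the true objective, appended to the history, and the two densities rebuilt before the next suggestion.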

Performance and results of ML models
The scatter plots in Fig. 7 illustrate the relationship between experimental and predicted outcomes for various ML models applied to the training and testing datasets for columns with different cross-section shapes. The data points gather tightly around the diagonal line for most ML models, indicating strong agreement between the model predictions and the experimental results and hence the reliability and prediction accuracy of the developed models. Table 4 reports evaluation metrics that assess the performance of the established ML models, including the mean (μ), coefficient of variation (CoV), coefficient of determination (R²), root mean squared error (RMSE), mean absolute percentage error (MAPE), and a20-index. In these metrics, y_i and ŷ_i are the actual and predicted output values of the i-th sample, respectively, ȳ is the mean value of the experimental observations, and n is the number of specimens in the database. The a20-index 16,38 measures the proportion of samples with an actual-to-predicted ratio, y_i/ŷ_i, falling within the range of 0.80-1.20. All data generated and algorithms introduced in this study are included in the supplementary file.
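The evaluation metrics listed above can be sketched as follows, using their commonly used definitions. Two points are assumptions of this sketch rather than statements from the text: the mean μ and CoV are taken over the actual-to-predicted ratio (the same ratio used by the a20-index), and the CoV uses the sample standard deviation.

```python
import math
import statistics

def evaluate_metrics(actual, predicted):
    """Return mean ratio, CoV, R2, RMSE, MAPE% and a20-index."""
    n = len(actual)
    ratios = [a / p for a, p in zip(actual, predicted)]
    mean_ratio = statistics.fmean(ratios)
    y_bar = statistics.fmean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    return {
        "mean": mean_ratio,
        "CoV": statistics.stdev(ratios) / mean_ratio,
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAPE%": 100.0 / n * sum(abs((a - p) / a)
                                 for a, p in zip(actual, predicted)),
        # share of samples whose actual/predicted ratio lies in [0.80, 1.20]
        "a20": sum(0.8 <= r <= 1.2 for r in ratios) / n,
    }

# Hypothetical capacities in kN
results = evaluate_metrics([100.0, 200.0, 300.0, 400.0],
                           [110.0, 190.0, 300.0, 300.0])
```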
As shown in Table 4, all introduced ML models display mean μ, R², and a20-index values close to 1.0 and small values of CoV, MAPE%, and RMSE for the different cross sections. The prediction results of all introduced models exhibit CoV less than 0.076, MAPE% lower than 6%, and RMSE less than 552 kN, indicating limited scatter in the predictions relative to the experimental results. Table 4 reveals that the CATB, GPR, and XGB models achieve the best evaluation metrics for the testing subsets, with MAPE% values equal to 1.394%, 1.518%, and 2.135% for the CCFST, RCFST, and CFDST column datasets, respectively. In addition, PSVR can accurately predict the capacity of stub CFST columns, with MAPE% values equal to 2.497% and 5.151% for CCFST and RCFST columns, respectively. The predictive capability of PSVR demonstrates that incorporating metaheuristic optimization methods 39, such as the PSO algorithm, can significantly enhance the performance of the SVR model. Furthermore, the evaluation metrics of the testing set resemble those of the training set, except for the GPR and CATB models. However, the performance of the GPR and CATB models on the testing set is comparable to, and in some cases better than, that of the remaining data-driven models. In addition, the R² value and a20-index for the entire dataset are nearly identical to those of the testing and training subsets. Such robust and stable alignment between the sub-datasets indicates minimal overfitting during the training of the models.
Although the GPR, CATB, and XGB models stand out with significantly superior results compared to the other models, extracting an explicit design formula from these models is a challenging task. In contrast, the equations extracted from the SR algorithm offer a distinct advantage by providing simple and practical explicit design formulas, making them more accessible and easier to interpret, even with slightly lower accuracy than the introduced ML models. Although ANN could provide accurate and explicit formulas for strength prediction, using the network in engineering design might not be practical due to the length of the resulting formulas.
The compressive strength predictions of CFST columns by the proposed equations were compared with the existing code formulas, including EC4 30 and AISC360 29, for the different types of columns. As observed in Table 4, for all types of CFST columns, the proposed equations attain a mean, R², and a20-index nearly equal to 1.0, with CoV less than 0.076 and MAPE% less than 5.9, while EC4 and AISC360 show CoV larger than 0.091 and 0.168 and MAPE% larger than 7.1% and 15%, respectively. In addition, the AISC360 predictions, compared to the EC4 predictions, appear to overestimate the axial capacity for different cross sections, with a higher mean approaching 1.20, a lower a20-index, and relatively high error indices. The RMSE and MAPE of the AISC360 29 predictions are approximately two to six times those of EC4 30, indicating the better performance of EC4 compared to AISC360. In addition, the a20-index of AISC360 is approximately 50% lower than that obtained from the EC4 results. This discrepancy could stem from the absence of confinement effect calculations in AISC360 29. Although all the cited design standards yield a safe design, the error indices of the ML models and the proposed equations are significantly smaller than those of these standards. Specifically, the proposed equations demonstrate superior performance compared to these standards across all evaluation criteria.
Figure 8 displays the prediction errors of the design standards and the developed ML models for different cross sections. It indicates that most of the introduced ML models are more accurate than the design standards.
Evaluating the influence of input parameters on axial compressive strength is a critical aspect of designing CFST columns. This study employs the Shapley Additive Explanations (SHAP) method to analyze the impact of input parameters on the strength index 40. As illustrated in Fig. 9, the summary plot shows the impact of each feature on a model's predictions and the relative importance of each feature for the axial strength. Figure 10 displays the SHAP feature importance for each input feature. A feature importance value greater than zero indicates a positive correlation between the variable and the strength index, while a value less than zero signifies a negative impact on the strength index. The SHAP decision plots in Fig. 11 reveal the complex decision-making process of the ML models and show how the summary plot works globally in predicting the axial compressive strength of CFST columns. The concrete compressive strength and the slenderness ratio stand out as the most influential design parameters within the dataset, especially for CCFST and CFDST columns. The remaining variables are ranked in descending order of feature importance. Additionally, it is observed that, except for the yield strength, f_y, and the steel tube thickness, t, all other input variables negatively influence or have a mixed impact on the strength index. This suggests that an increase in the outer tube yield strength and thickness will enhance the performance of stub columns, while increasing the section slenderness ratio and concrete strength will negatively impact the compressive strength index.
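The sign convention of the feature attributions discussed above can be illustrated with an exact Shapley value computation on a toy model. This sketch enumerates all feature subsets directly instead of using the SHAP library, and the linear `toy_model`, its coefficients, and the baseline input are hypothetical; they are not fitted to the study's data.

```python
import itertools
import math

def shapley_values(model, x, baseline):
    """Exact Shapley values of model at x relative to a baseline input."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their actual value, the rest the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            weight = (math.factorial(r) * math.factorial(n - r - 1)
                      / math.factorial(n))
            for s in itertools.combinations(others, r):
                phi[i] += weight * (value(set(s) | {i}) - value(set(s)))
    return phi

def toy_model(z):
    """Hypothetical strength index: rises with thickness, falls with
    section slenderness and concrete strength (all inputs scaled to [0, 1])."""
    thickness, slenderness, concrete_strength = z
    return 1.0 + 0.5 * thickness - 0.3 * slenderness - 0.2 * concrete_strength

phi = shapley_values(toy_model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
```

For this linear toy model each attribution equals the coefficient times the change in the feature, so a positive value marks a feature that raises the predicted strength index and a negative value one that lowers it.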

Conclusion
In conclusion, this study compiled a comprehensive experimental database of 1316 specimens from various research papers, including circular, rectangular, and double-skin CFST short columns under axial loading without eccentricity. The datasets were carefully selected to ensure reliable and consistent results. The axial load was normalized using a unitless variable termed the strength index to enhance the performance of the data-driven models. Various data-driven models, including Gaussian process regression (GPR), symbolic regression (SR), support vector regression optimized with particle swarm optimization (PSVR), artificial neural networks (ANN), XGBoost (XGB), CatBoost (CATB), Random Forest (RF), and LightGBM (LGBM) models, were developed and evaluated. In addition, design formulas are proposed for circular, rectangular, and double-skin CFST columns. The following conclusions can be drawn:
• The proposed normalization of the axial load yields a nearly normal distribution, which improves model performance and robustness. In addition, using the strength index as an output parameter provides insight into the level of confinement provided by the outer tube for different column types.
• The CATB, GPR, and XGB models stood out as the most accurate and reliable models for strength predictions of the CCFST, RCFST, and CFDST column datasets, respectively, while the proposed equations offered simple and practical expressions with acceptable accuracy.
• Symbolic regression emerges as a promising methodology for extracting empirical equations with practical applicability and meaningful physical interpretations.
• The proposed equations demonstrated their reliability and robustness compared to existing design code standards.
• SHAP analysis revealed that an increase in the outer tube yield strength and thickness will enhance the performance of stub columns, while increasing the section slenderness ratio and concrete strength will negatively impact the compressive strength index.
In summary, the proposed data-driven models can predict the axial compression capacity of CFST stub columns with reliable and accurate results, making them valuable tools for structural engineers.

Figure 2. Frequency histogram of the axial load output and strength index for CCFST columns.

Figure 3. Correlation matrix for the CFST columns database.

Figure 5. Gaussian process regression on a semilog scale on the y-axis for CCFST columns.

Figure 6. Mutation and crossover operations in SR model.

Figure 7. Comparison between proposed equations and ML models for training and testing datasets.

Figure 8. Prediction errors of design standards and established ML models.
Regression model: R² = 0.926, MAPE% = 13.2, with expression P_u = p(D) p(f_y) p(f_c′) p(t) p(L), where p(x) is a third-degree to sixth-degree polynomial function of x = P_1 + P_2 + P_3 + P_4 + P_5 + P_6 where* P_1 = Dt − D cos t − D_i −

Table 2. Statistical features of the experimental dataset.

Table 4. Comparison of the developed ML models for different column types.